
In the Claims



The status of claims in the case is as follows: 



1 1. [Previously presented] A cache coherency system for a 

2 shared memory parallel processing system including a 

3 plurality of processing nodes, comprising: 

4 a single multi-stage communication network for 

5 interconnecting said processing nodes, said network 

6 including a dual priority switch at each node for 

7 selectively operating in normal low priority mode and 

8 camp-on high priority mode; 
9 

10 each said processing node including a unique section of 

11 shared memory which is not a cache; 

12 each said processing node including one or more caches 

13 for storing a plurality of cache lines; 

14 a cache coherency directory which is distributed to 

15 each of said nodes for tracking which of one or more of 

16 said nodes have copies of each cache line; and 
17 an adapter for storing changed data immediately to said

18 unique section of shared memory regardless of which of 

19 said nodes is changing the data and which of said nodes 

20 includes the section of shared memory to be changed, 

21 such that said shared memory always contains the most 

22 recent data according to a two hop process including in 

23 hop 1) a requesting node requests most recent data of a 

24 home node, and in hop 2) said home node immediately 

25 returns said most recent data from its shared memory to 

26 said requesting node. 
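
The two hop process recited in claim 1 can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the claimed adapter: the class `Node`, the helper `home_of`, the address-interleaving rule, and the invalidation of stale copies on a write are all hypothetical, and the network is modeled as direct method calls.

```python
# Illustrative sketch only: Node, home_of and the interleaving rule are
# hypothetical names and choices, not taken from the claims.

class Node:
    def __init__(self, node_id, nodes):
        self.node_id = node_id
        self.nodes = nodes      # shared list of all nodes (stands in for the network)
        self.memory = {}        # this node's unique section of shared memory
        self.cache = {}         # local cache: address -> data
        self.directory = {}     # address -> ids of nodes holding a copy

    def home_of(self, addr):
        # Assumed rule: addresses are interleaved across the nodes.
        return self.nodes[addr % len(self.nodes)]

    def write(self, addr, data):
        # Changed data goes immediately to the home node's shared memory,
        # whichever node performs the write.
        home = self.home_of(addr)
        home.memory[addr] = data
        self.cache[addr] = data
        # Assumed step: drop stale copies listed in the home directory.
        for holder in home.directory.get(addr, set()) - {self.node_id}:
            self.nodes[holder].cache.pop(addr, None)
        home.directory[addr] = {self.node_id}

    def read(self, addr):
        if addr in self.cache:
            return self.cache[addr]
        home = self.home_of(addr)       # hop 1: request sent to the home node
        data = home.memory.get(addr)    # hop 2: home returns data from its memory
        home.directory.setdefault(addr, set()).add(self.node_id)
        self.cache[addr] = data
        return data


nodes = []
nodes.extend(Node(i, nodes) for i in range(4))
nodes[0].write(5, "v1")
first = nodes[2].read(5)
nodes[3].write(5, "v2")
second = nodes[2].read(5)
```

Because every write lands in the home node's memory first, a read never needs a third hop to some other node's cache in this model.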

27 2. [Withdrawn] A shared memory parallel processing

28 system including a plurality of processing nodes,

29 comprising:

30 a multi-stage communication network for interconnecting

31 said processing nodes, said network including a

32 plurality of self-routing switches cascaded into first,

33 middle and last stages, each said switch including a

34 plurality of switch inputs and a plurality of switch

35 outputs, each of said switch outputs of each said

36 switch coupled to a different switch input of others of

37 said switches, switch outputs of said last stage

38 switches including network output ports, and switch
inputs of said first stage switches comprising network
input ports;

each processing node including

a local processor;

a network adapter for transmitting and receiving
messages with respect to other processing nodes
over said network;

at least one private write-through cache;

a section of shared memory organized into a
plurality of cache lines, each cache line
including one or more addressable memory
locations;

a cache coherency directory for tracking which of
said nodes have copies of each cache line;

said local processor at a first processing node being
operable for writing data to said private cache at said
first node, as the same data is written to either
shared memory at said first node or sent over said
network for writing to the shared memory and private
cache of a second processing node.

3. [Withdrawn] The shared memory parallel processing
system of claim 2, wherein said section of shared memory is
divided into first and second portions, said first portion
for storing unchangeable data, and said second portion for
storing changeable data.

4. [Withdrawn] The shared memory parallel processing
system of claim 3, said cache coherency directory for this
processing node listing which nodes of the plurality of
nodes have accessed copies of said cache lines in said
second portion of shared memory at this processing node.



5. [Withdrawn] The shared memory parallel processing
system of claim 4, wherein each said processing node is
operable for reading, storing, and invalidating the shared
memory at any of said plurality of processing nodes
selectively by transmitting and receiving messages over said
network, a first message type for requesting the read of a
cache line, a second message type for returning the
requested cache line, a third message type for storing a
cache line, and a fourth message type for invalidating a
cache line.
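
The four message types named in this claim can be sketched as an enumeration with a small handler. This is a hedged illustration: the names `MessageType` and `handle`, the numeric codes, and the dictionary-based memory and cache are assumptions made for the example and do not come from the claims.

```python
from enum import Enum


class MessageType(Enum):
    # Numeric codes are arbitrary; only the four roles come from the claim.
    READ_REQUEST = 1    # first type: request the read of a cache line
    READ_RESPONSE = 2   # second type: return the requested cache line
    STORE = 3           # third type: store a cache line
    INVALIDATE = 4      # fourth type: invalidate a cache line


def handle(msg_type, line_addr, memory, cache, payload=None):
    """Apply one received message to a node's memory and cache (hypothetical)."""
    if msg_type is MessageType.READ_REQUEST:
        # Answer with a response message carrying the line from shared memory.
        return (MessageType.READ_RESPONSE, line_addr, memory.get(line_addr))
    if msg_type is MessageType.READ_RESPONSE:
        cache[line_addr] = payload
    elif msg_type is MessageType.STORE:
        memory[line_addr] = payload
    elif msg_type is MessageType.INVALIDATE:
        cache.pop(line_addr, None)
    return None


memory, cache = {0x40: "old"}, {0x40: "old"}
handle(MessageType.STORE, 0x40, memory, cache, "new")
handle(MessageType.INVALIDATE, 0x40, memory, cache)
reply = handle(MessageType.READ_REQUEST, 0x40, memory, cache)
```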



6. [Withdrawn] The shared memory parallel processing
system of claim 5, said network adapter further comprising:

a first buffer for transmitting to said network shared
memory read command messages of said first message type
and said second message type;

a second buffer for transmitting to said network shared
memory store command messages of said third message
type;

a third buffer for transmitting to said network
invalidate messages for said cache coherency directory
of said fourth message type;

a fourth buffer for receiving from said network shared
memory read command messages of said first message type
and said second message type;

a fifth buffer for receiving from said network shared
memory store command messages of said third message
type; and

a sixth buffer for receiving from said network
invalidate messages for said cache coherency directory
of said fourth message type.

7. [Withdrawn] A shared memory parallel processing
system, comprising:

a plurality of nodes, each node including a node
memory, at least one cache, and a memory controller;

a multi-stage switching network for
interconnecting said processing nodes, said switching
network including a plurality of self-routing switches
cascaded into first, middle and last stages, each said
switch including a plurality of switch inputs and a
plurality of switch outputs, each of said switch
outputs of each said switch coupled to a different
switch input of others of said switches, switch outputs
of said last stage switches including network output
ports, and switch inputs of said first stage switches
comprising network input ports;

a system memory distributed to said node memories
of said plurality of nodes and accessible by any node;

each said node memory being organized into a plurality
of addressable word locations;

said memory controller at this node operable for
performing local memory access to the portion of system
memory at this node and for performing remote memory
access over said network to the portion of system
memory at other nodes; and

a cache coherency controller at this node being
responsive to both local memory accesses and remote
memory accesses to data stored in a word location of
said node memory at this node for caching accessed data
in the cache of this node and for communicating data
for assuring cache coherency throughout said system
over said network.



1 8. [Withdrawn] The shared memory processing system of 

2 claim 7, said system memory being distributed in equal

3 portions to each said node memory; and said node memory

4 being further sub-divided into a first memory section for

5 storing data that is changeable and a second memory section 

6 for storing data that is unchangeable. 



9. [Withdrawn] The shared memory processing system of
claim 7, further comprising node indicia for uniquely
identifying each node.



10. [Withdrawn] The shared memory processing system of
claim 7, said cache coherency controller further comprising:

an invalidation directory for storing a list of node
indicia identifying those nodes having accessed a copy
of each said cache line of node memory since the last
time the cache line was changed.

11. [Withdrawn] The shared memory processing system of
claim 10, said cache coherency controller further
comprising:

an overflow directory for expanding said invalidation
directory when the list of node indicia for a cache
line becomes too long to contain entirely with said
invalidation directory.
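
One way to picture the invalidation directory of claim 10 together with the overflow directory of claim 11 is sketched below. It is only an illustration: the class name, the fixed number of slots per cache line, and the dictionary layout are assumptions, since the claims do not specify how either directory is organized.

```python
class InvalidationDirectory:
    """Per-line list of node ids with a spill area (hypothetical layout)."""

    def __init__(self, slots_per_line=2):
        self.slots_per_line = slots_per_line   # assumed fixed entry size
        self.entries = {}     # line address -> node ids held in the main directory
        self.overflow = {}    # line address -> node ids that did not fit

    def record_access(self, line_addr, node_id):
        main = self.entries.setdefault(line_addr, [])
        extra = self.overflow.get(line_addr, [])
        if node_id in main or node_id in extra:
            return                              # already listed
        if len(main) < self.slots_per_line:
            main.append(node_id)
        else:
            # The list is too long for the main entry: expand into overflow.
            self.overflow.setdefault(line_addr, []).append(node_id)

    def line_changed(self, line_addr):
        """Return the nodes to invalidate and restart the list for this line."""
        return self.entries.pop(line_addr, []) + self.overflow.pop(line_addr, [])


d = InvalidationDirectory(slots_per_line=2)
for node in (3, 7, 7, 9):
    d.record_access(0x100, node)
spilled = list(d.overflow[0x100])
to_invalidate = d.line_changed(0x100)
```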



12. [Withdrawn] A shared memory parallel processing
system, comprising:

a plurality of nodes, each node including a node
memory, at least one cache, and a memory controller;

a multi-stage switching network for interconnecting
said processing nodes, said switching network including
a plurality of self-routing switches cascaded into
first, middle and last stages, each said switch
including a plurality of switch inputs and a plurality
of switch outputs, each of said switch outputs of each
said switch coupled to a different switch input of
others of said switches, switch outputs of said last
stage switches including network output ports, and
switch inputs of said first stage switches comprising
network input ports;

a network adapter responsive to a node connection
request for establishing a connection path to a target
node, first by attempting to establish a quick
connection path across a plurality of segments of said
switching network to said target node, and upon
determining any one of said plurality of segments is
not available, issuing a camp-on connection request to
said target node.
13. [Withdrawn] The shared memory parallel processing
system of claim 12, further comprising:

said plurality of nodes each coupled to one of the
network output ports and to one of the network input
ports;

each node further including:

receive means for receiving a data message; and

send means for sending a message across an
n-stage switching network from a local node to a
remote node, said send means generating said
connection request including n sequential
connection commands, each sequential connection
command selecting one of said plurality of
connection segments for each of the n switch
stages of said network.
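
The connection behaviour described in claims 12 and 13 (one connection command per switch stage, a quick attempt first, then a camp-on request if any segment is unavailable) can be sketched as follows. The function names, the dictionary encoding of a command, and the set of busy segments are hypothetical; real switches would signal this over control lines rather than return strings.

```python
def build_connection_request(output_ports):
    """One connection command per switch stage (assumed encoding)."""
    return [{"stage": stage, "port": port}
            for stage, port in enumerate(output_ports)]


def connect(request, busy_segments):
    """Try the quick path; fall back to camp-on if any segment is busy."""
    for command in request:
        if (command["stage"], command["port"]) in busy_segments:
            return "camp-on"    # a segment is not available
    return "quick"              # every segment of the path was free


request = build_connection_request([2, 0, 3])    # n = 3 switch stages
mode_free = connect(request, busy_segments=set())
mode_busy = connect(request, busy_segments={(1, 0)})
```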



14. [Withdrawn] The shared memory parallel processing
system of claim 12, each said switch being responsive to
node connection requests and camp-on connection requests for
establishing connection segments from any switch input port

15. [Withdrawn] The shared memory parallel processing
system of claim 14, each said switch further comprising:

a data bus for transferring said data message;

a rejection control line for signalling back to a
sending node a rejection of any connection request;

an acceptance control line for signalling back to said
sending node the acceptance of a camp-on connection
request;

a valid control line for receiving from said sending
node the activation of a node connection request; and

a camp-on control line for receiving from said sending
node the activation of a camp-on connection request.

16. [Withdrawn] A bi-directional network adapter for
interfacing a local node of a shared memory parallel
processing system to a multi-stage switching network for
communications with a remote node, each said node including



EN9997080B 



13 



S/N 09/394,564 






a node memory including a changeable portion and an
unchangeable portion, and a node cache; said network adapter
comprising:

a plurality of send buffers for storing and forwarding
data messages from said local node to said remote node
over said network, and

a plurality of receive buffers for storing and
forwarding a plurality of data messages from said
remote node to said local node over said multi-stage
network;

said data messages including:

an invalidation message for invalidating a cache
line that was accessed by a remote node after said
cache line has changed;

a read request message for requesting access of a
cache line from a remote node;

a response message for returning a cache line over
the network to a remote node that has previously
requested data by a read request message; and

a store message storing a changed cache line to a
remote node.

17. [Withdrawn] The network adapter of claim 16, said
data messages further including a message header comprising:

message type differentiation indicia;

destination node indicia for identifying a node for
receiving said data message over said network;

source node indicia for identifying a node for
transmitting said data message over said network;

message length indicia for defining the variable number
of words included in said data message;

memory area indicia for defining whether memory words
included in said data message are read from said
changeable area;

time indicia for defining the time of generation of
said data message; and

memory address indicia for defining the address
location in memory of the memory word included in said
data message.

18. [Withdrawn] The network adapter of claim 17, said
send buffers further comprising:

a read send FIFO for storing and forwarding read
request messages and response messages from said local
node to said remote node;

a store send FIFO for storing and forwarding store
messages from said local node to said remote node; and

an invalidation send FIFO for storing and forwarding
invalidation messages from said local node to said
remote node;

and said receive buffers further comprising:

a read receive FIFO for storing and forwarding read
request messages and response messages from said remote
node to said local node;

a store receive FIFO for storing and forwarding store
messages from said remote node to said local node; and

an invalidation receive FIFO for storing and forwarding
invalidation messages from said remote node to said
local node.
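
The three send FIFOs of claim 18, combined with the selection order recited in claim 19 (invalidation messages first, then alternating between read and store messages), can be sketched as below. The class `SendFifoSelector` and its attribute names are hypothetical, and the tie-breaking rule when only one of the read and store FIFOs holds a message is an assumption.

```python
from collections import deque


class SendFifoSelector:
    """Picks the next message to send from three FIFOs (illustrative only)."""

    def __init__(self):
        self.invalidation = deque()
        self.read = deque()
        self.store = deque()
        self._read_turn = True   # which FIFO goes first when both are waiting

    def next_message(self):
        # Invalidation messages always go out first.
        if self.invalidation:
            return self.invalidation.popleft()
        ordered = (self.read, self.store) if self._read_turn else (self.store, self.read)
        for fifo in ordered:
            if fifo:
                # Hand the next turn to the other FIFO.
                self._read_turn = fifo is self.store
                return fifo.popleft()
        return None              # nothing to send


sel = SendFifoSelector()
sel.read.extend(["R1", "R2"])
sel.store.extend(["S1", "S2"])
sel.invalidation.append("I1")
order = [sel.next_message() for _ in range(5)]
```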



19. [Withdrawn] The network adapter of claim 18, further
comprising:

a send FIFO selection means for prioritizing the
selection of a data message from one of said three send
FIFO means for transmission to said network by first
selecting data messages from said invalidation send
FIFO and thereafter alternatively selecting data
messages from said read and store send FIFOs;

a receive FIFO selection means responsive to said
message type differentiation indicia for selecting one
of said three receive FIFO means for storing a data
message received from said network; and

said network adapter being responsive to a node
connection request for establishing a connection path
to a target node, first by attempting to establish a
quick connection path across a plurality of segments of
said switching network to said target node, and upon
determining any one of said plurality of segments is
not available, issuing a camp-on connection request to
said target node.

20. [Withdrawn] A memory controller for a local node of
a shared memory parallel processing system, said node
including a node processor, a node memory, a node cache and
an I/O adapter, said system including a multi-stage
switching network for communications amongst said local node
and a plurality of remote nodes, said node memory including
a changeable portion and an unchangeable portion; said
memory controller comprising:

first means responsive to a request by said processor
for access to a memory word for first accessing said
node cache of said local node; and

second means responsive to said first means being
unable to access said memory word in said node cache
for accessing said memory word selectively from a cache
line in said node memory or remote memory and storing
said cache line to said node cache.
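
The lookup order recited in claim 20 (node cache first, then node memory or remote memory, with the fetched cache line stored to the node cache) can be sketched as a single function. The function `access_word`, the callback `fetch_remote`, the line size `LINE_WORDS`, and the use of a `range` to mark locally held addresses are all assumptions made for illustration.

```python
LINE_WORDS = 4   # assumed cache line size, in words


def access_word(addr, cache, local_memory, fetch_remote, local_range):
    """Return the word at addr, filling the node cache on a miss."""
    line_addr = addr - addr % LINE_WORDS
    if line_addr not in cache:              # first means: try the node cache
        if addr in local_range:             # second means: node memory ...
            line = [local_memory[line_addr + i] for i in range(LINE_WORDS)]
        else:                               # ... or remote memory
            line = fetch_remote(line_addr)
        cache[line_addr] = line             # store the whole line to the cache
    return cache[line_addr][addr - line_addr]


local_memory = {a: a * 10 for a in range(8)}
remote_memory = {a: a * 10 for a in range(8, 16)}
remote_fetches = []


def fetch_remote(line_addr):
    # Stand-in for a read request and response over the network.
    remote_fetches.append(line_addr)
    return [remote_memory[line_addr + i] for i in range(LINE_WORDS)]


cache = {}
local_word = access_word(5, cache, local_memory, fetch_remote, range(8))
remote_word = access_word(9, cache, local_memory, fetch_remote, range(8))
again = access_word(10, cache, local_memory, fetch_remote, range(8))
```

The third call hits the line cached by the second, so only one remote fetch is issued in this example.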



21. [Withdrawn] The memory controller of claim 20,
further comprising:

remote fetch interrupt means for issuing an interrupt
signal to said node processor upon determining that a
requested memory word is located in remote memory for
causing said node processor to switch from a first
instruction stream thread to a second instruction
stream thread.

22. [Withdrawn] The memory controller of claim 20,
further comprising:

data message generation means responsive to a request
from a remote node for accessing a cache line
identified by a remote request read address for
generating a read response message to return the
accessed cache line to said remote node, said read






response message including a message header comprising

message differentiation indicia for defining said
read request message type;

destination node indicia equal to the sector
segment of said node memory for said addressed
memory word;

source node indicia set to the node ID number of
the local node;

message length indicia for defining said read
request message comprised of said message
header only;

memory address indicia for specifying the memory
address of said memory word;

said data message generation means further operable for
delivering said read response message to a read send
FIFO of said network adapter for transmission to said
network and the remote node selected by said
destination node indicia.
23. [Withdrawn] The memory controller of claim 20,
further comprising:

an invalidation directory;

cast-out means for deleting a cache line from said node
cache when said cache is full to provide space for a
new cache line to be stored to said cache; and for
sending the address of the deleted cache line to said
invalidation directory to indicate said node no longer
has a copy of said cache line.

24. [Withdrawn] The memory controller of claim 23,
further comprising:

cast-out message generation means responsive to said
cast-out means deleting a cache line addressed to a
remote node for generating a cast-out message to said
remote node to send the cast-out address and the local
node ID number to said remote node over said network;

cast-out message receiving means for delivering a
cast-out address and the source node ID number from the
message header of a cast-out message to said
invalidation directory.

25. [Withdrawn] The memory controller of claim 20,
further comprising:

cache copy update means for sending cache update
messages to update corresponding cache lines all remote
nodes having copies of a changed cache line; and

cache update message receiving means for replacing a
cache line of data with an updated cache line of data
received from a remote node.

26. [Withdrawn] The bi-directional network adapter of
claim 16, said data messages further comprising:

a cast-out message for invalidating an invalidation
directory entry at a remote node for this local node;

a cache copy update message for updating copies of a
changed cache line at this local node at remote nodes
having copies of said changed cache line; and

a node indicia assignment message for sending a
different node number to each of the plurality of nodes
of the system.



27. [Withdrawn] A method for operating memory controller
for a local node of a shared memory parallel processing
system, said node including a node processor, a node memory,
a node cache and an I/O adapter, said system including a
multi-stage switching network for communications amongst
said local node and a plurality of remote nodes, said node
memory including a changeable portion and an unchangeable
portion; the method comprising the steps of:

responsive to a request by said processor for access to
a memory word, accessing said node cache of said local
node; and thereafter

responsive to said first means being unable to access
said memory word in said node cache, accessing said
memory word selectively from a cache line in said node
memory or remote memory and storing said cache line to
said node cache.

28. [Withdrawn] The method of claim 27, further
comprising the step of



3 issuing an interrupt signal to said node processor upon 

4 determining that a requested memory word is located in

5 remote memory for causing said node processor to switch

6 from a first instruction stream thread to a second 

7 instruction stream thread. 



1 29. [Withdrawn] A method for operating bi-directional 

2 network adapter for interfacing a local node of a shared 

3 memory parallel processing system to a multi-stage switching 

4 network for communications with a remote node, each said 

5 node including a node memory including a changeable portion 

6 and an unchangeable portion, and a node cache; comprising

7 the steps of: 



8 operating a plurality of send buffers for storing and
9 forwarding data messages from said local node to said
10 remote node over said network, and

11 operating a plurality of receive buffers for storing
12 and forwarding a plurality of data messages from said
13 remote node to said local node over said multi-stage
14 network;

15 said data messages including:

16 an invalidation message for invalidating a cache 

17 line that was accessed by a remote node after said 

18 cache line has changed; 

19 a read request message for requesting access of a 

20 cache line from a remote node; 



21 a response message for returning a cache line over
22 the network to a remote node that has previously
23 requested data by a read request message; and

24 a store message for storing a changed cache line to a
25 remote node.



1 30. [Withdrawn] The method of claim 29, further 

2 comprising the steps of: 



3 operating a read send FIFO for storing and forwarding
4 read request messages and response messages from said
5 local node to said remote node;

6 operating a store send FIFO for storing and forwarding
7 store messages from said local node to said remote

8 node; and 

9 operating an invalidation send FIFO for storing and 

10 forwarding invalidation messages from said local node 

11 to said remote node; 

12 operating a read receive FIFO for storing and

13 forwarding read request messages and response messages

14 from said remote node to said local node;

15 operating a store receive FIFO for storing and

16 forwarding store messages from said remote node to said

17 local node; and

18 operating an invalidation receive FIFO for storing and

19 forwarding invalidation messages from said remote node

20 to said local node.



1 31. [Previously presented] A method for operating a shared 

2 memory parallel processing system as a cache coherency

3 system including a plurality of processing nodes, each said 

4 processing node including a unique section of shared memory 

5 which is not a cache, comprising the steps of: 
6 interconnecting said processing nodes through a single

7 multi-stage communication network, said network 

8 including a dual priority switch at each node for 

9 selectively operating in normal low priority mode and 

10 camp-on high priority mode; 

11 storing at each said processing node a plurality of 

12 cache lines in one or more caches; 

13 distributing to each of said processing nodes a cache 

14 coherency directory; 

15 tracking in said cache coherency directory which of 

16 said one or more of said processing nodes have copies 

17 of each cache line; and 

18 changing said shared memory according to a two hop 

19 process including in hop 1) a requesting node 

20 requests most recent data of a home node, and in hop 

21 2) said home node immediately returns said most 

22 recent data from its shared memory to said requesting 

23 node, wherein changed data is stored immediately to 

24 said unique section of shared memory regardless of 

25 which of said nodes is changing the data and which of 
26 said nodes includes the section of shared memory to

27 be changed, wherein said shared memory always 

28 contains the most recent data. 

1 32. [Previously presented] A program storage device 

2 readable by a machine, tangibly embodying a program of

3 instructions executable by a machine to perform method steps 

4 for operating a shared memory parallel processing system 

5 including a plurality of processing nodes, each said 

6 processing node including a unique section of shared memory 

7 which is not a cache, said method steps comprising: 

8 interconnecting said processing nodes through a single 

9 multi-stage communication network, said network 

10 including a dual priority switch at each node for 

11 selectively operating in normal low priority mode and 

12 camp-on high priority mode; 

13 storing at each said processing node a plurality of 

14 cache lines in one or more caches; 

15 tracking in a cache coherency directory which is 

16 distributed to each of said processing nodes which of 

17 one or more of said processing nodes have copies of 
18 each cache line; and

19 changing said unique section of shared memory according 

20 to a two hop process including in hop 1) a requesting 

21 node requests most recent data of a home node, and in 

22 hop 2) said home node immediately returns said most 

23 recent data from its shared memory to said requesting 

24 node, wherein changed data is stored immediately to 

25 shared memory regardless of which of said nodes is 

26 changing the data and which of said nodes includes the 

27 section of shared memory to be changed, wherein said 

28 shared memory always contains the most recent data. 

1 33. [Previously presented] An article of manufacture 

2 comprising: 

3 a computer useable medium having computer readable 

4 program code means embodied therein for operating a 

5 shared memory parallel processing system including a 

6 plurality of processing nodes, each said processing 

7 node including a unique section of shared memory which 

8 is not a cache, the computer readable program means in 

9 said article of manufacture comprising: 
10 computer readable program code means for causing a

11 computer to effect interconnecting said processing 

12 nodes through a multi-stage communication network, said 

13 network including a dual priority switch at each node 

14 for selectively operating in normal low priority mode 

15 and camp-on high priority mode; 

16 computer readable program code means for causing a 

17 computer to effect storing at each said processing node 

18 a plurality of cache lines in one or more caches; 

19 computer readable program code means for causing a 

20 computer to effect tracking in a cache coherency 

21 directory which is distributed to each of said 

22 processing nodes which of said processing nodes have 

23 copies of each cache line; and 

24 computer readable program code means for storing 

25 changed data immediately to said unique section of 

26 shared memory regardless of which of said nodes is 

27 changing the data and which of said nodes includes the 

28 section of shared memory to be changed according to a 

29 two hop process including in hop 1) a requesting node 

30 requests most recent data of a home node, and in hop 2) 
31 said home node immediately returns said most recent

32 data from its shared memory to said requesting node, 

33 such that said shared memory always contains the most 

34 recent data. 

1 34. [Previously presented] A computer program product or 

2 computer program element for operating a shared memory 

3 parallel processing system including a plurality of 

4 processing nodes, each said node including a unique section 

5 of shared memory which is not a cache, according to the 

6 steps of: 

7 interconnecting said processing nodes through a single 

8 multi-stage communication network, said network 

9 including a dual priority switch at each node for 

10 selectively operating in normal low priority mode and 

11 camp-on high priority mode; 

12 storing at each said processing node a plurality of 

13 cache lines in one or more caches; 

14 distributing to each of said processing nodes a cache 

15 coherency directory; 
16 tracking in said cache coherency directory which of

17 said processing nodes have copies of each cache line; 

18 and 

19 storing changed data immediately to said unique section 

20 of shared memory regardless of which of said nodes is 

21 changing the data and which of said nodes includes the 

22 section of shared memory to be changed according to a 

23 two hop process including in hop 1) a requesting node 

24 requests most recent data of a home node, and in hop 2) 

25 said home node immediately returns said most recent 
26 data from its shared memory to said requesting node

27 such that said shared memory always contains the most 

28 recent data. 

1 35. [Original] The cache coherency system of claim 1, 

2 further comprising: 

3 a shared memory including a first memory portion for 

4 storing unchangeable data and a second memory portion 

5 for storing changeable data; and 

6 said cache coherency directory listing which nodes of 

7 said plurality of processing nodes have accessed copies 
8 of said cache lines in said second memory portion.

1 36. [Original] The cache coherency system of claim 35, 

2 each of said plurality of processing nodes being operable 

3 for reading, storing, and invalidating said shared memory at 

4 any other of said processing nodes. 

1 37. [Previously presented] The cache coherency system of 

2 claim 36, further comprising at a first node of said 

3 plurality of processing nodes a memory controller 

4 selectively operable first responsive to a request for 

5 access to a memory word by first accessing the cache at 

6 said first node and, if said requested memory word is not 

7 available in said cache, selectively operable second for 

8 accessing said memory word selectively from said shared 

9 memory regardless of which of said nodes includes the 

10 section of shared memory being accessed, and storing said 

11 cache line including said memory word to said cache at said 

12 first node. 

1 38. [Previously presented] The cache coherency system of 

2 claim 37, said memory controller further being selectively 

3 operable for deleting a cache line from said cache at said 

4 first node when said cache is full to provide space for a 
5 new cache line to be stored to said cache, and for sending

6 the address of the deleted cache line to an invalidation 

7 directory to indicate said node no longer has a copy of said 

8 cache line. 

1 39. [Previously presented] The cache coherency system of 

2 claim 37, said memory controller further being selectively 

3 operable for sending cache update messages to update 

4 corresponding cache lines at all remote nodes having copies 

5 of a changed cache line and for receiving cache lines of 

6 data from remote nodes for updating the cache at said first 

7 node. 

1 40. [New] A cache coherency system for a shared memory 

2 parallel processing system including a plurality of 

3 processing nodes, comprising: 

4 a multi-stage communication network for interconnecting 

5 said processing nodes; 
6 

7 each said processing node including a unique section of 

8 shared memory which is not a cache; 

9 each said processing node including one or more caches 
10 for storing a plurality of cache lines; 

11 a cache coherency directory which is distributed to 

12 each of said nodes for tracking which of said nodes 

13 have copies of each cache line; and 

14 an adapter for storing changed data immediately to said 

15 unique section of shared memory regardless of which of 

16 said nodes is changing the data and which of said nodes 

17 includes the section of shared memory to be changed, 

18 such that said shared memory always contains the most 

19 recent data. 

1 41. [New] A method for operating a shared memory parallel 

2 processing system as a cache coherency system including a 

3 plurality of processing nodes, each said processing node 

4 including a unique section of shared memory which is not a 

5 cache, comprising the steps of: 

6 interconnecting said processing nodes through a multi- 

7 stage communication network; 

8 storing at each said processing node a plurality of 

9 cache lines in one or more caches; 
10 distributing to each of said processing nodes a cache 

11 coherency directory; 



12 tracking in said cache coherency directory which of 

13 said processing nodes have copies of each cache line; 

14 and 

15 changing said shared memory, wherein changed data is 

16 stored immediately to said unique section of shared 

17 memory regardless of which of said nodes is changing 

18 the data and which of said nodes includes the section 

19 of shared memory to be changed, wherein said shared 

20 memory always contains the most recent data. 

1 42. [New] A program storage device readable by a machine, 

2 tangibly embodying a program of instructions executable by a 

3 machine to perform method steps for operating a shared 

4 memory parallel processing system including a plurality of 

5 processing nodes, each said processing node including a 

6 unique section of shared memory which is not a cache, said 

7 method steps comprising: 

8 interconnecting said processing nodes through a multi- 

9 stage communication network; 
10 storing at each said processing node a plurality of 

11 cache lines in one or more caches; 

12 tracking in a cache coherency directory which is 

13 distributed to each of said processing nodes which of 

14 said processing nodes have copies of each cache line; 

15 and 

16 changing said unique section of shared memory, wherein 

17 changed data is stored immediately to shared memory 

18 regardless of which of said nodes is changing the data 

19 and which of said nodes includes the section of shared 

20 memory to be changed, wherein said shared memory always 

21 contains the most recent data. 

1 43. [New] An article of manufacture comprising: 

2 a computer useable medium having computer readable 

3 program code means embodied therein for operating a 

4 shared memory parallel processing system including a 

5 plurality of processing nodes, each said processing 

6 node including a unique section of shared memory which 

7 is not a cache, the computer readable program means in 

8 said article of manufacture comprising: 
9 computer readable program code means for causing a 

10 computer to effect interconnecting said processing 

11 nodes through a multi-stage communication network; 

12 computer readable program code means for causing a 

13 computer to effect storing at each said processing node 

14 a plurality of cache lines in one or more caches; 

15 computer readable program code means for causing a 

16 computer to effect tracking in a cache coherency 

17 directory which is distributed to each of said 
18 processing nodes which of said processing nodes have 

19 copies of each cache line; and 

20 computer readable program code means for storing 

21 changed data immediately to said unique section of 

22 shared memory regardless of which of said nodes is 

23 changing the data and which of said nodes includes the 

24 section of shared memory to be changed such that said 

25 shared memory always contains the most recent data. 

1 44. [New] A computer program product or computer program 

2 element for operating a shared memory parallel processing 

3 system including a plurality of processing nodes, each said 
4 node including a unique section of shared memory which is 

5 not a cache, according to the steps of: 

6 interconnecting said processing nodes through a multi- 

7 stage communication network; 

8 storing at each said processing node a plurality of 

9 cache lines in one or more caches; 

10 distributing to each of said processing nodes a cache 

11 coherency directory; 

12 tracking in said cache coherency directory which of 

13 said processing nodes have copies of each cache line; 

14 and 

15 storing changed data immediately to said unique section 

16 of shared memory regardless of which of said nodes is 

17 changing the data and which of said nodes includes the 

18 section of shared memory to be changed such that said 

19 shared memory always contains the most recent data. 

1 45. [New] The cache coherency system of claim 40, further 

2 comprising: 
3 a shared memory including a first memory portion for 

4 storing unchangeable data and a second memory portion 

5 for storing changeable data; and 



6 said cache coherency directory listing which nodes of 

7 said plurality of processing nodes have accessed copies 

8 of said cache lines in said second memory portion. 



1 46. [New] The cache coherency system of claim 45, each of 

2 said plurality of processing nodes being operable for 

3 reading, storing, and invalidating said shared memory at any 

4 other of said processing nodes. 

1 47. [New] The cache coherency system of claim 46, further 

2 comprising at a first node of said plurality of processing 

3 nodes a memory controller selectively operable first 

4 responsive to a request for access to a memory word by 

5 first accessing the cache at said first node and, if said 

6 requested memory word is not available in said cache, 

7 selectively operable second for accessing said memory word 

8 selectively from said shared memory regardless of which of 

9 said nodes includes the section of shared memory being 

10 accessed, and storing said cache line including said memory 

11 word to said cache at said first node. 
1 48. [New] The cache coherency system of claim 47, said 

2 memory controller further being selectively operable for 

3 deleting a cache line from said cache at said first node 

4 when said cache is full to provide space for a new cache 

5 line to be stored to said cache, and for sending the address 

6 of the deleted cache line to an invalidation directory to 

7 indicate said node no longer has a copy of said cache line. 

1 49. [New] The cache coherency system of claim 47, said 

2 memory controller further being selectively operable for 

3 sending cache update messages to update corresponding cache 

4 lines at all remote nodes having copies of a changed cache 

5 line and for receiving cache lines of data from remote nodes 

6 for updating the cache at said first node. 